Skip to content

[Clang] Add __builtin_invoke and use it in libc++ #116709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jun 29, 2025

Conversation

philnik777
Copy link
Contributor

@philnik777 philnik777 commented Nov 18, 2024

std::invoke is currently quite heavy compared to a function call, since it involves quite heavy SFINAE. This can be done significantly more efficient by the compiler, since most calls to std::invoke are simple function calls and 6 out of the seven overloads for std::invoke exist only to support member pointers. Even these boil down to a few relatively simple checks.

Some real-world testing with this patch revealed some significant results. For example, instantiating std::format("Banane") (and its callees) went down from ~125ms on my system to ~104ms.

Copy link

github-actions bot commented Nov 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@philnik777
Copy link
Contributor Author

philnik777 commented Nov 22, 2024

@ldionne I think this is the more appropriate place to discuss this.

(From #116637 (comment))

Chris had a patch on Clang at some point where we discussed this and I voiced concerns. Here it is: https://reviews.llvm.org/D130867

Here are some of my concerns:

  1. Implementing more stuff in the compiler adds complexity to the compiler, and does not decrease the complexity of the library because libc++ still needs to implement invoke on other compilers.

  2. It duplicates the tests, since we need to test the builtin and the standard library interface to the same level.

  3. Duplicating functionality in libc++ and in the compiler is confusing. For instance, it sets up the stage for people not knowing where a bug in std::invoke should be fixed. Is it in the compiler? In the library? In both? And the set of folks who can fix a bug in the compiler is a lot smaller than the set of folks who can do that in the library.

  4. It creates another level of coupling between the library and the compiler version-wise. For example, if we have a LWG issue fixing std::invoke, we must now fix it in Clang and whether the issue is fixed is not something we can determine from the library, it becomes something that depends on the compiler. This is technically a pre-existing issue with the builtin type traits, but in practice the type traits have a simple API and they don't change much, unlike std::invoke which has a complex simplification specification and could be the target of changes.

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

  2. True. I'm not sure this is a huge burden though.

  3. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

  4. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

@ldionne
Copy link
Member

ldionne commented Nov 22, 2024

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

Does GCC implement (or have plans to implement) __builtin_invoke?

  1. True. I'm not sure this is a huge burden though.

Well, our tests for std::invoke are not simple, and we'd have to do the same in Clang if we want to have the same amount of coverage.

  1. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

Your PR description says that the compiler recognizes std::invoke itself as a builtin, is that right? It makes a difference to me whether std::invoke is a builtin directly or std::invoke calls __builtin_invoke.

  1. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

Sorry, I meant complex specification. I mean that invoke is not like is_constructible -- it may be updated in the future. The updates would be to the library wording, and I'd find it really weird to have to go change the compiler builtin to account for what's effectively a library change.

@philnik777
Copy link
Contributor Author

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

Does GCC implement (or have plans to implement) __builtin_invoke?

Not that I'm aware of.

  1. True. I'm not sure this is a huge burden though.

Well, our tests for std::invoke are not simple, and we'd have to do the same in Clang if we want to have the same amount of coverage.

AFAICT we have one test for invoke plus a constexpr test which should really just be rolled into the generic invoke test. That test is ~400 LoC, which is definitely not nothing, but also not unmanageable IMO. We could probably also simplify that by not explicitly listing every cv combination but instead using types::for_each.

  1. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

Your PR description says that the compiler recognizes std::invoke itself as a builtin, is that right? It makes a difference to me whether std::invoke is a builtin directly or std::invoke calls __builtin_invoke.

Yes, it does. If that is the main point of contention we can move that into a separate patch. I don't expect the compile times to differ significantly between recognizing std::invoke as a builtin itself and having std::invoke use __builtin_invoke as the implementation. It would still be nice for improved debug performance and reduced debug info though.

What exactly is the difference you perceive between the two? I mean, sure, there is one, but if std::invoke just always calls __builtin_invoke there really isn't a meaningful one.

  1. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

Sorry, I meant complex specification. I mean that invoke is not like is_constructible -- it may be updated in the future. The updates would be to the library wording, and I'd find it really weird to have to go change the compiler builtin to account for what's effectively a library change.

Is your main concern here that we'd be limited in the library what we can change without the changing the compiler? If that is the case, would your concerns be alleviated if we could opt out of the builtin recognition for specific overloads, e.g. through an attribute?

@ldionne
Copy link
Member

ldionne commented Nov 26, 2024

AFAICT we have one test for invoke plus a constexpr test which should really just be rolled into the generic invoke test. That test is ~400 LoC, which is definitely not nothing, but also not unmanageable IMO. We could probably also simplify that by not explicitly listing every cv combination but instead using types::for_each.

As long as libc++ keeps testing its public API fully, I'm OK with this. What I don't want is to start shifting test burden towards Clang because the compiler actually implements most of the functionality. These builtins should be implementation details, so we should still be testing the full public API in libc++ itself.

The testing question from the Clang side is interesting and I would expect Clang folks to have an opinion. From the Clang side, I'd expect some concerns about adding a builtin with such a complex and user-visible "API". But those concerns are not really for me to raise, if Clang folks are fine with it, that's not an issue for me.

Yes, it does. If that is the main point of contention we can move that into a separate patch. I don't expect the compile times to differ significantly between recognizing std::invoke as a builtin itself and having std::invoke use __builtin_invoke as the implementation. It would still be nice for improved debug performance and reduced debug info though.

What exactly is the difference you perceive between the two? I mean, sure, there is one, but if std::invoke just always calls __builtin_invoke there really isn't a meaningful one.

[...]

Is your main concern here that we'd be limited in the library what we can change without the changing the compiler? If that is the case, would your concerns be alleviated if we could opt out of the builtin recognition for specific overloads, e.g. through an attribute?

As discussed just now in the libc++ monthly:

  • Silently hijacking a libc++ function by the compiler is surprising and certainly unexpected unless you already know what's going on. I can imagine someone trying to fix a bug in std::invoke and spending hours scratching their heads not understanding why std::invoke is still not doing the right thing after their changes, only to find out that the compiler was really implementing the function under the hood. These things should be explicit.
  • Silently hijacking also means that we lose control over what's happening from the library side, so a bug or an improvement requires changes to the compiler.

My view is that compiler builtins should be tools to implement an API more simply and more efficiently. They shouldn't be the API itself.

Personally, I think the best way to move this forward would be to remove the hijacking from this patch to make it less contentious, and then to potentially pursue an attribute to formalize how hijacking works. This could (and should) be used beyond std::invoke, for example with std::forward and std::move. That would actually be an improvement to the status-quo in relation to my concerns with "discoverability".

@philnik777 philnik777 changed the title [Clang] Add __builtin_invoke and detect std::invoke as a builtin [Clang] Add __builtin_invoke Dec 14, 2024
@philnik777 philnik777 changed the title [Clang] Add __builtin_invoke [Clang] Add __builtin_invoke and use it in libc++ Dec 14, 2024
@philnik777 philnik777 force-pushed the builtin_invoke branch 2 times, most recently from ddcc918 to c52c8f3 Compare March 30, 2025 09:40
@philnik777 philnik777 marked this pull request as ready for review May 20, 2025 12:16
@philnik777 philnik777 requested a review from a team as a code owner May 20, 2025 12:16
@llvmbot llvmbot added clang Clang issues not falling into any other category libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 20, 2025
@llvmbot
Copy link
Member

llvmbot commented May 20, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

std::invoke is currently quite heavy compared to a function call, since it involves quite heavy SFINAE. This can be done significantly more efficient by the compiler, since most calls to std::invoke are simple function calls and 6 out of the seven overloads for std::invoke exist only to support member pointers. Even these boil down to a few relatively simple checks.

Some real-world testing with this patch revealed some significant results. For example, instantiating std::format("Banane") (and its callees) went down from ~125ms on my system to ~104ms.


Patch is 29.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116709.diff

9 Files Affected:

  • (modified) clang/include/clang/Basic/Builtins.td (+6)
  • (modified) clang/include/clang/Sema/Sema.h (+9)
  • (modified) clang/lib/Parse/ParseDeclCXX.cpp (+1-23)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+97)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (+54-51)
  • (added) clang/test/CodeGenCXX/builtin-invoke.cpp (+61)
  • (added) clang/test/SemaCXX/builtin-invoke.cpp (+133)
  • (modified) libcxx/include/__type_traits/invoke.h (+127-28)
  • (modified) libcxx/include/__type_traits/is_core_convertible.h (+11)
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 187d3b5ed24a7..58cc35088c40a 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4272,6 +4272,12 @@ def MoveIfNsoexcept : CxxLibBuiltin<"utility"> {
   let Namespace = "std";
 }
 
+def Invoke : Builtin {
+  let Spellings = ["__builtin_invoke"];
+  let Attributes = [CustomTypeChecking, Constexpr];
+  let Prototype = "void(...)";
+}
+
 def Annotation : Builtin {
   let Spellings = ["__builtin_annotation"];
   let Attributes = [NoThrow, CustomTypeChecking];
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 5ec67087aeea4..22d66e8688906 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2594,6 +2594,8 @@ class Sema final : public SemaBase {
                                SourceLocation BuiltinLoc,
                                SourceLocation RParenLoc);
 
+  ExprResult BuiltinInvoke(CallExpr *TheCall);
+
   static StringRef GetFormatStringTypeName(FormatStringType FST);
   static FormatStringType GetFormatStringType(StringRef FormatFlavor);
   static FormatStringType GetFormatStringType(const FormatAttr *Format);
@@ -15220,11 +15222,18 @@ class Sema final : public SemaBase {
                                SourceLocation Loc);
   QualType BuiltinRemoveReference(QualType BaseType, UTTKind UKind,
                                   SourceLocation Loc);
+
+  QualType BuiltinRemoveCVRef(QualType BaseType, SourceLocation Loc) {
+    return BuiltinRemoveReference(BaseType, UTTKind::RemoveCVRef, Loc);
+  }
+
   QualType BuiltinChangeCVRQualifiers(QualType BaseType, UTTKind UKind,
                                       SourceLocation Loc);
   QualType BuiltinChangeSignedness(QualType BaseType, UTTKind UKind,
                                    SourceLocation Loc);
 
+  bool BuiltinIsBaseOf(SourceLocation RhsTLoc, QualType LhsT, QualType RhsT);
+
   /// Ensure that the type T is a literal type.
   ///
   /// This routine checks whether the type @p T is a literal type. If @p T is an
diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index 316bc30edf1f0..aeb1112bad8b4 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -1611,29 +1611,7 @@ void Parser::ParseClassSpecifier(tok::TokenKind TagTokKind,
       Tok.isOneOf(
 #define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) tok::kw___##Trait,
 #include "clang/Basic/TransformTypeTraits.def"
-          tok::kw___is_abstract,
-          tok::kw___is_aggregate,
-          tok::kw___is_arithmetic,
-          tok::kw___is_array,
-          tok::kw___is_assignable,
-          tok::kw___is_base_of,
-          tok::kw___is_bounded_array,
-          tok::kw___is_class,
-          tok::kw___is_complete_type,
-          tok::kw___is_compound,
-          tok::kw___is_const,
-          tok::kw___is_constructible,
-          tok::kw___is_convertible,
-          tok::kw___is_convertible_to,
-          tok::kw___is_destructible,
-          tok::kw___is_empty,
-          tok::kw___is_enum,
-          tok::kw___is_floating_point,
-          tok::kw___is_final,
-          tok::kw___is_function,
-          tok::kw___is_fundamental,
-          tok::kw___is_integral,
-          tok::kw___is_interface_class,
+          tok::kw___is_convertible, // Last use in libc++ was removed in 925a11a
           tok::kw___is_literal,
           tok::kw___is_lvalue_expr,
           tok::kw___is_lvalue_reference,
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index a960b9931ddfd..26579de25bdf0 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2368,6 +2368,8 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     return BuiltinShuffleVector(TheCall);
     // TheCall will be freed by the smart pointer here, but that's fine, since
     // BuiltinShuffleVector guts it, but then doesn't release it.
+  case Builtin::BI__builtin_invoke:
+    return BuiltinInvoke(TheCall);
   case Builtin::BI__builtin_prefetch:
     if (BuiltinPrefetch(TheCall))
       return ExprError();
@@ -5406,6 +5408,101 @@ ExprResult Sema::ConvertVectorExpr(Expr *E, TypeSourceInfo *TInfo,
                                    RParenLoc, CurFPFeatureOverrides());
 }
 
+ExprResult Sema::BuiltinInvoke(CallExpr *TheCall) {
+  auto Loc = TheCall->getBeginLoc();
+  auto Args = MutableArrayRef(TheCall->getArgs(), TheCall->getNumArgs());
+  assert(llvm::none_of(Args,
+                       [](Expr *Arg) { return Arg->isTypeDependent(); }));
+
+  if (Args.size() == 0) {
+    Diag(TheCall->getBeginLoc(), diag::err_typecheck_call_too_few_args_at_least)
+        << 0 << 1 << 0 << 0 << TheCall->getSourceRange();
+    return ExprError();
+  }
+
+  auto FuncT = Args[0]->getType();
+
+  if (auto *MPT = FuncT->getAs<MemberPointerType>()) {
+    if (Args.size() < 2) {
+      Diag(TheCall->getBeginLoc(),
+            diag::err_typecheck_call_too_few_args_at_least)
+          << 0 << 2 << 1 << 0 << TheCall->getSourceRange();
+      return ExprError();
+    }
+
+    auto *MemPtrClass = MPT->getQualifier()->getAsType();
+    auto ObjectT = Args[1]->getType();
+
+
+    if (MPT->isMemberDataPointer() && Args.size() != 2) {
+      Diag(TheCall->getBeginLoc(), diag::err_typecheck_call_too_many_args)
+          << 0 << 2 << Args.size() << 0 << TheCall->getSourceRange();
+      return ExprError();
+    }
+
+    ExprResult ObjectArg = [&]() -> ExprResult {
+      // (1.1): (t1.*f)(t2, …, tN) when f is a pointer to a member function of a
+      // class T and is_same_v<T, remove_cvref_t<decltype(t1)>> ||
+      // is_base_of_v<T, remove_cvref_t<decltype(t1)>> is true;
+      // (1.4): t1.*f when N=1 and f is a pointer to data member of a class T
+      // and is_same_v<T, remove_cvref_t<decltype(t1)>> ||
+      // is_base_of_v<T, remove_cvref_t<decltype(t1)>> is true;
+      if (Context.hasSameType(QualType(MemPtrClass, 0),
+                              BuiltinRemoveCVRef(ObjectT, Loc)) ||
+          BuiltinIsBaseOf(Args[1]->getBeginLoc(), QualType(MemPtrClass, 0),
+                          BuiltinRemoveCVRef(ObjectT, Loc))) {
+        return Args[1];
+      }
+
+      // (t1.get().*f)(t2, …, tN) when f is a pointer to a member function of
+      // a class T and remove_cvref_t<decltype(t1)> is a specialization of
+      // reference_wrapper;
+      if (auto *RD = ObjectT->getAsCXXRecordDecl()) {
+        if (RD->isInStdNamespace() &&
+            RD->getDeclName().getAsString() == "reference_wrapper") {
+          CXXScopeSpec SS;
+          IdentifierInfo *GetName = &Context.Idents.get("get");
+          UnqualifiedId GetID;
+          GetID.setIdentifier(GetName, Loc);
+
+          auto MemExpr = ActOnMemberAccessExpr(
+              getCurScope(), Args[1], Loc, tok::period, SS,
+              /*TemplateKWLoc=*/SourceLocation(), GetID, nullptr);
+
+          if (MemExpr.isInvalid())
+            return ExprError();
+
+          return ActOnCallExpr(getCurScope(), MemExpr.get(), Loc, {}, Loc);
+        }
+      }
+
+      // ((*t1).*f)(t2, …, tN) when f is a pointer to a member function of a
+      // class T and t1 does not satisfy the previous two items;
+
+      return ActOnUnaryOp(getCurScope(), Loc, tok::star, Args[1]);
+    }();
+
+    if (ObjectArg.isInvalid())
+      return ExprError();
+
+    auto BinOp = ActOnBinOp(getCurScope(), TheCall->getBeginLoc(),
+                            tok::periodstar, ObjectArg.get(), Args[0]);
+    if (BinOp.isInvalid())
+      return ExprError();
+
+    if (MPT->isMemberDataPointer())
+      return BinOp;
+
+    auto *MemCall = new (Context)
+        ParenExpr(SourceLocation(), SourceLocation(), BinOp.get());
+
+    return ActOnCallExpr(getCurScope(), MemCall, TheCall->getBeginLoc(),
+                         Args.drop_front(2), TheCall->getRParenLoc());
+  }
+  return ActOnCallExpr(getCurScope(), Args.front(), TheCall->getBeginLoc(),
+                       Args.drop_front(), TheCall->getRParenLoc());
+}
+
 bool Sema::BuiltinPrefetch(CallExpr *TheCall) {
   unsigned NumArgs = TheCall->getNumArgs();
 
diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index b071c98051bbe..e945b446953c7 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -6540,67 +6540,70 @@ ExprResult Sema::ActOnTypeTrait(TypeTrait Kind, SourceLocation KWLoc,
   return BuildTypeTrait(Kind, KWLoc, ConvertedArgs, RParenLoc);
 }
 
-static bool EvaluateBinaryTypeTrait(Sema &Self, TypeTrait BTT, const TypeSourceInfo *Lhs,
-                                    const TypeSourceInfo *Rhs, SourceLocation KeyLoc) {
-  QualType LhsT = Lhs->getType();
-  QualType RhsT = Rhs->getType();
+bool Sema::BuiltinIsBaseOf(SourceLocation RhsTLoc, QualType LhsT,
+                           QualType RhsT) {
+  // C++0x [meta.rel]p2
+  // Base is a base class of Derived without regard to cv-qualifiers or
+  // Base and Derived are not unions and name the same class type without
+  // regard to cv-qualifiers.
+
+  const RecordType *lhsRecord = LhsT->getAs<RecordType>();
+  const RecordType *rhsRecord = RhsT->getAs<RecordType>();
+  if (!rhsRecord || !lhsRecord) {
+    const ObjCObjectType *LHSObjTy = LhsT->getAs<ObjCObjectType>();
+    const ObjCObjectType *RHSObjTy = RhsT->getAs<ObjCObjectType>();
+    if (!LHSObjTy || !RHSObjTy)
+      return false;
 
-  assert(!LhsT->isDependentType() && !RhsT->isDependentType() &&
-         "Cannot evaluate traits of dependent types");
+    ObjCInterfaceDecl *BaseInterface = LHSObjTy->getInterface();
+    ObjCInterfaceDecl *DerivedInterface = RHSObjTy->getInterface();
+    if (!BaseInterface || !DerivedInterface)
+      return false;
 
-  switch(BTT) {
-  case BTT_IsBaseOf: {
-    // C++0x [meta.rel]p2
-    // Base is a base class of Derived without regard to cv-qualifiers or
-    // Base and Derived are not unions and name the same class type without
-    // regard to cv-qualifiers.
-
-    const RecordType *lhsRecord = LhsT->getAs<RecordType>();
-    const RecordType *rhsRecord = RhsT->getAs<RecordType>();
-    if (!rhsRecord || !lhsRecord) {
-      const ObjCObjectType *LHSObjTy = LhsT->getAs<ObjCObjectType>();
-      const ObjCObjectType *RHSObjTy = RhsT->getAs<ObjCObjectType>();
-      if (!LHSObjTy || !RHSObjTy)
-        return false;
+    if (RequireCompleteType(RhsTLoc, RhsT,
+                            diag::err_incomplete_type_used_in_type_trait_expr))
+      return false;
 
-      ObjCInterfaceDecl *BaseInterface = LHSObjTy->getInterface();
-      ObjCInterfaceDecl *DerivedInterface = RHSObjTy->getInterface();
-      if (!BaseInterface || !DerivedInterface)
-        return false;
+    return BaseInterface->isSuperClassOf(DerivedInterface);
+  }
 
-      if (Self.RequireCompleteType(
-              Rhs->getTypeLoc().getBeginLoc(), RhsT,
-              diag::err_incomplete_type_used_in_type_trait_expr))
-        return false;
+  assert(Context.hasSameUnqualifiedType(LhsT, RhsT) ==
+         (lhsRecord == rhsRecord));
 
-      return BaseInterface->isSuperClassOf(DerivedInterface);
-    }
+  // Unions are never base classes, and never have base classes.
+  // It doesn't matter if they are complete or not. See PR#41843
+  if (lhsRecord && lhsRecord->getDecl()->isUnion())
+    return false;
+  if (rhsRecord && rhsRecord->getDecl()->isUnion())
+    return false;
+
+  if (lhsRecord == rhsRecord)
+    return true;
 
-    assert(Self.Context.hasSameUnqualifiedType(LhsT, RhsT)
-             == (lhsRecord == rhsRecord));
+  // C++0x [meta.rel]p2:
+  //   If Base and Derived are class types and are different types
+  //   (ignoring possible cv-qualifiers) then Derived shall be a
+  //   complete type.
+  if (RequireCompleteType(RhsTLoc, RhsT,
+                          diag::err_incomplete_type_used_in_type_trait_expr))
+    return false;
 
-    // Unions are never base classes, and never have base classes.
-    // It doesn't matter if they are complete or not. See PR#41843
-    if (lhsRecord && lhsRecord->getDecl()->isUnion())
-      return false;
-    if (rhsRecord && rhsRecord->getDecl()->isUnion())
-      return false;
+  return cast<CXXRecordDecl>(rhsRecord->getDecl())
+    ->isDerivedFrom(cast<CXXRecordDecl>(lhsRecord->getDecl()));
+}
 
-    if (lhsRecord == rhsRecord)
-      return true;
+static bool EvaluateBinaryTypeTrait(Sema &Self, TypeTrait BTT, const TypeSourceInfo *Lhs,
+                                    const TypeSourceInfo *Rhs, SourceLocation KeyLoc) {
+  QualType LhsT = Lhs->getType();
+  QualType RhsT = Rhs->getType();
 
-    // C++0x [meta.rel]p2:
-    //   If Base and Derived are class types and are different types
-    //   (ignoring possible cv-qualifiers) then Derived shall be a
-    //   complete type.
-    if (Self.RequireCompleteType(
-            Rhs->getTypeLoc().getBeginLoc(), RhsT,
-            diag::err_incomplete_type_used_in_type_trait_expr))
-      return false;
+  assert(!LhsT->isDependentType() && !RhsT->isDependentType() &&
+         "Cannot evaluate traits of dependent types");
+
+  switch(BTT) {
+  case BTT_IsBaseOf:
+    return Self.BuiltinIsBaseOf(Rhs->getTypeLoc().getBeginLoc(), LhsT, RhsT);
 
-    return cast<CXXRecordDecl>(rhsRecord->getDecl())
-      ->isDerivedFrom(cast<CXXRecordDecl>(lhsRecord->getDecl()));
-  }
   case BTT_IsVirtualBaseOf: {
     const RecordType *BaseRecord = LhsT->getAs<RecordType>();
     const RecordType *DerivedRecord = RhsT->getAs<RecordType>();
diff --git a/clang/test/CodeGenCXX/builtin-invoke.cpp b/clang/test/CodeGenCXX/builtin-invoke.cpp
new file mode 100644
index 0000000000000..af66dfd4dae30
--- /dev/null
+++ b/clang/test/CodeGenCXX/builtin-invoke.cpp
@@ -0,0 +1,61 @@
+// RUN: %clang_cc1 -triple=x86_64-linux-gnu -emit-llvm -o - %s | FileCheck %s
+
+extern "C" void* memcpy(void*, const void*, decltype(sizeof(int)));
+void func();
+
+namespace std {
+  template <class T>
+  class reference_wrapper {
+    T* ptr;
+
+  public:
+    T& get() { return *ptr; }
+  };
+} // namespace std
+
+struct Callable {
+  void operator()() {}
+
+  void func();
+};
+
+extern "C" void call1() {
+  __builtin_invoke(func);
+  __builtin_invoke(Callable{});
+  __builtin_invoke(memcpy, nullptr, nullptr, 0);
+
+  // CHECK:      define dso_local void @call1
+  // CHECK-NEXT: entry:
+  // CHECK-NEXT:   %ref.tmp = alloca %struct.Callable, align 1
+  // CHECK-NEXT:   call void @_Z4funcv()
+  // CHECK-NEXT:   call void @_ZN8CallableclEv(ptr noundef nonnull align 1 dereferenceable(1) %ref.tmp)
+  // CHECK-NEXT:   call void @llvm.memcpy.p0.p0.i64(ptr align 1 null, ptr align 1 null, i64 0, i1 false)
+  // CHECK-NEXT:   ret void
+}
+
+extern "C" void call_memptr(std::reference_wrapper<Callable> wrapper) {
+  __builtin_invoke(&Callable::func, wrapper);
+
+  // CHECK:      define dso_local void @call_memptr
+  // CHECK-NEXT: entry:
+  // CHECK-NEXT:   %wrapper = alloca %"class.std::reference_wrapper", align 8
+  // CHECK-NEXT:   %coerce.dive = getelementptr inbounds nuw %"class.std::reference_wrapper", ptr %wrapper, i32 0, i32 0
+  // CHECK-NEXT:   store ptr %wrapper.coerce, ptr %coerce.dive, align 8
+  // CHECK-NEXT:   %call = call noundef nonnull align 1 dereferenceable(1) ptr @_ZNSt17reference_wrapperI8CallableE3getEv(ptr noundef nonnull align 8 dereferenceable(8) %wrapper)
+  // CHECK-NEXT:   %0 = getelementptr inbounds i8, ptr %call, i64 0
+  // CHECK-NEXT:   br i1 false, label %memptr.virtual, label %memptr.nonvirtual
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.virtual:
+  // CHECK-NEXT:   %vtable = load ptr, ptr %0, align 8
+  // CHECK-NEXT:   %1 = getelementptr i8, ptr %vtable, i64 sub (i64 ptrtoint (ptr @_ZN8Callable4funcEv to i64), i64 1), !nosanitize !2
+  // CHECK-NEXT:   %memptr.virtualfn = load ptr, ptr %1, align 8, !nosanitize !2
+  // CHECK-NEXT:   br label %memptr.end
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.nonvirtual:
+  // CHECK-NEXT:   br label %memptr.end
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.end:
+  // CHECK-NEXT:   %2 = phi ptr [ %memptr.virtualfn, %memptr.virtual ], [ @_ZN8Callable4funcEv, %memptr.nonvirtual ]
+  // CHECK-NEXT:   call void %2(ptr noundef nonnull align 1 dereferenceable(1) %0)
+  // CHECK-NEXT:   ret void
+}
diff --git a/clang/test/SemaCXX/builtin-invoke.cpp b/clang/test/SemaCXX/builtin-invoke.cpp
new file mode 100644
index 0000000000000..5b156b5ff75c4
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-invoke.cpp
@@ -0,0 +1,133 @@
+// RUN: %clang_cc1 -verify -fsyntax-only %s
+
+void func() { // expected-note {{'func' declared here}}
+  __builtin_invoke(); // expected-error {{too few arguments to function call, expected at least 1, have 0}}
+}
+
+void nfunc() noexcept {}
+
+struct S {};
+void argfunc(int, S) {} // expected-note {{'argfunc' declared here}}
+
+struct Callable {
+  void operator()() {}
+
+  void func() {}
+
+  int var;
+};
+
+void* malloc(decltype(sizeof(int)));
+
+template <class T>
+struct pointer_wrapper {
+  T* v;
+
+  T& operator*() {
+    return *v;
+  }
+};
+
+namespace std {
+  template <class T>
+  class reference_wrapper {
+    T* ptr;
+
+  public:
+    reference_wrapper(T& ref) : ptr(&ref) {}
+
+    T& get() { return *ptr; }
+  };
+
+  template <class T>
+  reference_wrapper<T> ref(T& v) {
+    return reference_wrapper<T>(v);
+  }
+} // namespace std
+
+struct InvalidSpecialization1 {
+  void func() {}
+
+  int var;
+};
+
+template <>
+class std::reference_wrapper<InvalidSpecialization1> {
+public:
+  reference_wrapper(InvalidSpecialization1&) {}
+};
+
+struct InvalidSpecialization2 {
+  void func() {}
+
+  int var;
+};
+
+template <>
+class std::reference_wrapper<InvalidSpecialization2> {
+public:
+  reference_wrapper(InvalidSpecialization2&) {}
+
+private:
+  InvalidSpecialization2& get(); // expected-note 2 {{declared private here}}
+};
+
+void call() {
+  __builtin_invoke(func);
+  __builtin_invoke(nfunc);
+  static_assert(!noexcept(__builtin_invoke(func)));
+  static_assert(noexcept(__builtin_invoke(nfunc)));
+  __builtin_invoke(func, 1); // expected-error {{too many arguments to function call, expected 0, have 1}}
+  __builtin_invoke(argfunc, 1); // expected-error {{too few arguments to function call, expected 2, have 1}}
+  __builtin_invoke(Callable{});
+  __builtin_invoke(malloc, 0);
+  __builtin_invoke(__builtin_malloc, 0); // expected-error {{builtin functions must be directly called}}
+
+  // Member functiom pointer
+  __builtin_invoke(&Callable::func); // expected-error {{too few arguments to function call, expected at least 2, have 1}}
+  __builtin_invoke(&Callable::func, 1); // expected-error {{indirection requires pointer operand ('int' invalid)}}
+  __builtin_invoke(&Callable::func, Callable{});
+  __builtin_invoke(&Callable::func, Callable{}, 1); // expected-error {{too many arguments to function call, expected 0, have 1}}
+
+  Callable c;
+  __builtin_invoke(&Callable::func, &c);
+  __builtin_invoke(&Callable::func, std::ref(c));
+  __builtin_invoke(&Callable::func, &c);
+  __builtin_invoke(&Callable::func, &c, 2); // expected-error {{too many arguments to function call, expected 0, have 1}}
+  __builtin_invoke(&Callable::func, pointer_wrapper<Callable>{&c});
+  __builtin_invoke(&Callable::func, pointer_wrapper<Callable>{&c}, 2); // expected-error {{too many arguments to function call, expected 0, have 1}}
+
+  InvalidSpecialization1 is1;
+  InvalidSpecialization2 is2;
+  __builtin_invoke(&InvalidSpecialization1::func, std::ref(is1)); // expected-error {{no member named 'get' in 'std::reference_wrapper<InvalidSpecialization1>'}}
+  __builtin_invoke(&InvalidSpecialization2::func, std::ref(is2)); // expected-error {{'get' is a private member of 'std::reference_wrapper<InvalidSpecialization2>'}}
+
+  // Member data pointer
+  __builtin_invoke(&Callable::var); // expected-error {{too few arguments to function call, expected at least 2, have 1}}
+  __builtin_invoke(&Callable::var, 1); // expected-error {{indirection requires pointer operand ('int' invalid)}}
+  (void)__builtin_invoke(&Callable::var, Callable{});
+  __builtin_invoke(&Callable::var, Callable{}, 1); // expected-error {{too many arguments to function call, expected 2, have 3}}
+
+  (void)__builtin_invoke(&Callable::var, &c);
+  (void)__builtin_invoke(&Callable::var, std::ref(c));
+  (void)__builtin_invoke(&Callable::var, &c);
+  __builtin_invoke(&Callable::var, &c, 2); // expected-error {{too many arguments to function call, expected 2,...
[truncated]

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of nits, else I'm reasonably ok with this. Interested to hear what @AaronBallman has to say, else I think this is beneficial.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy as-is. Please give Aaron a chance to poke if he wants to.

@philnik777 philnik777 merged commit 7138397 into llvm:main Jun 29, 2025
111 of 118 checks passed
@philnik777 philnik777 deleted the builtin_invoke branch June 29, 2025 15:52
@EricWF
Copy link
Member

EricWF commented Jun 29, 2025

@philnik777 I'm encountering a number of test failures related to libc++'s clang-tidy module when running the libc++ test suite with the new compiler:

# .---command stdout------------
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:68:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    68 | using __invoke_result_t = decltype(__builtin_invoke(std::declval<_Args>()...));
# |       | ^
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:79:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    79 | using __invoke_result = __invoke_result_impl<void, _Args...>;
# |       | ^

Could you please take a look at these?

template <class... Args>
concept invocable = requires(Args... args) { __builtin_invoke(args...); };

static_assert(!invocable<std::reference_wrapper<InvalidSpecialization1>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a positive test for invocable?


namespace std {
template <class... Args>
auto invoke(Args&&... args) -> decltype(__builtin_invoke(args...));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere in the tests?

@philnik777
Copy link
Contributor Author

@philnik777 I'm encountering a number of test failures related to libc++'s clang-tidy module when running the libc++ test suite with the new compiler:

# .---command stdout------------
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:68:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    68 | using __invoke_result_t = decltype(__builtin_invoke(std::declval<_Args>()...));
# |       | ^
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:79:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    79 | using __invoke_result = __invoke_result_impl<void, _Args...>;
# |       | ^

Could you please take a look at these?

Urgh, looks like clang-tidy doesn't run in the bootstrapping build. I'll update my local clang and fix any issues.

rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
`std::invoke` is currently quite heavy compared to a function call,
since it involves quite heavy SFINAE. This can be done significantly
more efficient by the compiler, since most calls to `std::invoke` are
simple function calls and 6 out of the seven overloads for `std::invoke`
exist only to support member pointers. Even these boil down to a few
relatively simple checks.

Some real-world testing with this patch revealed some significant
results. For example, instantiating `std::format("Banane")` (and its
callees) went down from ~125ms on my system to ~104ms.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
`std::invoke` is currently quite heavy compared to a function call,
since it involves quite heavy SFINAE. This can be done significantly
more efficient by the compiler, since most calls to `std::invoke` are
simple function calls and 6 out of the seven overloads for `std::invoke`
exist only to support member pointers. Even these boil down to a few
relatively simple checks.

Some real-world testing with this patch revealed some significant
results. For example, instantiating `std::format("Banane")` (and its
callees) went down from ~125ms on my system to ~104ms.
Artem-B added a commit to Artem-B/llvm-project that referenced this pull request Jul 15, 2025
Since llvm#116709 more libc++ code relies on std::declval() and it broke some CUDA compilations.

The new wrapper adds GPU-side overloads for the declval() helper functions
which allows it to continue working when used from CUDA sources.
Artem-B added a commit to Artem-B/llvm-project that referenced this pull request Jul 15, 2025
Since llvm#116709 more libc++ code relies on std::declval() and it broke some CUDA compilations.

The new wrapper adds GPU-side overloads for the declval() helper functions
which allows it to continue working when used from CUDA sources.
Artem-B added a commit that referenced this pull request Jul 15, 2025
Since #116709 more libc++ code relies on std::declval() and it broke
some CUDA compilations.

The new wrapper adds GPU-side overloads for the declval() helper
functions which allows it to continue working when used from CUDA
sources.
@alexfh
Copy link
Contributor

alexfh commented Jul 22, 2025

This breaks some valid code. Here a partially reduced test case that depends on libc++ and Google Test: https://gcc.godbolt.org/z/dPznsz44M

#include <variant>
#include <vector>

#include <gmock/gmock.h>

template <typename T>
class Nullable {
 public:
  Nullable() : value_() {}
  template <typename V>
  Nullable(V value) : value_(value) {}

  const T& value() const { return value_; }

 private:
  T value_;
};

template <typename L, typename R>
inline bool operator== (const Nullable<L>& lhs, const Nullable<R>& rhs) {
  return lhs.value() == rhs.value();
}

void f() {
  std::vector<std::variant<int, Nullable<int>>> v;
  ASSERT_THAT(v, ::testing::ElementsAre(16));
}

It used to compile fine, but now it triggers a compilation error:

/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/variant:1186:37: fatal error: recursive template instantiation exceeded maximum depth of 1024
 1186 |              enable_if_t<!is_same_v<__remove_cvref_t<_Arg>, variant>, int>        = 0,
      |                                     ^
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/variant:1192:35: note: while substituting prior template arguments into non-type template parameter [with _Arg = testing::Matcher<const std::variant<int, Nullable<int>> &>]
 1192 |   _LIBCPP_HIDE_FROM_ABI constexpr variant(_Arg&& __arg) noexcept(is_nothrow_constructible_v<_Tp, _Arg>)
      |                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1193 |       : __impl_(in_place_index<_Ip>, std::forward<_Arg>(__arg)) {}
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/__type_traits/invoke.h:71:69: note: while substituting deduced template arguments into function template 'variant' [with _Arg = testing::Matcher<const std::variant<int, Nullable<int>> &>, $1 = (no value), $2 = (no value), $3 = (no value), _Tp = (no value), _Ip = (no value), $6 = (no value)]
   71 | using __invoke_result_t _LIBCPP_NODEBUG = decltype(__builtin_invoke(std::declval<_Args>()...));
      |                                                                     ^
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/__type_traits/invoke.h:387:1: note: in instantiation of template type alias '__invoke_result_t' requested here
  387 | using invoke_result_t = __invoke_result_t<_Fn, _Args...>;
      | ^
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/variant:1140:1: note: in instantiation of template type alias 'invoke_result_t' requested here
 1140 | using __best_match_t _LIBCPP_NODEBUG = typename invoke_result_t<_MakeOverloads<_Types...>, _Tp, _Tp>::type;
      | ^
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/variant:1189:45: note: in instantiation of template type alias '__best_match_t' requested here
 1189 |              class _Tp  = __variant_detail::__best_match_t<_Arg, _Types...>,
      |                                             ^
/opt/compiler-explorer/clang-trunk-20250722/bin/../include/c++/v1/variant:1192:35: note: (skipping 2533 contexts in backtrace; use -ftemplate-backtrace-limit=0 to see all)
 1192 |   _LIBCPP_HIDE_FROM_ABI constexpr variant(_Arg&& __arg) noexcept(is_nothrow_constructible_v<_Tp, _Arg>)
      |                                   ^
/opt/compiler-explorer/libs/googletest/trunk/googlemock/include/gmock/gmock-matchers.h:356:12: note: in instantiation of function template specialization 'testing::internal::MatcherCastImpl<const std::vector<std::variant<int, Nullable<int>>> &, testing::internal::ElementsAreMatcher<std::tuple<int>>>::CastImpl<false>' requested here
  356 |     return CastImpl(polymorphic_matcher_or_value,
      |            ^
/opt/compiler-explorer/libs/googletest/trunk/googlemock/include/gmock/gmock-matchers.h:529:43: note: in instantiation of member function 'testing::internal::MatcherCastImpl<const std::vector<std::variant<int, Nullable<int>>> &, testing::internal::ElementsAreMatcher<std::tuple<int>>>::Cast' requested here
  529 |   return internal::MatcherCastImpl<T, M>::Cast(matcher);
      |                                           ^
/opt/compiler-explorer/libs/googletest/trunk/googlemock/include/gmock/gmock-matchers.h:536:10: note: in instantiation of function template specialization 'testing::MatcherCast<const std::vector<std::variant<int, Nullable<int>>> &, testing::internal::ElementsAreMatcher<std::tuple<int>>>' requested here
  536 |   return MatcherCast<T>(polymorphic_matcher_or_value);
      |          ^
/opt/compiler-explorer/libs/googletest/trunk/googlemock/include/gmock/gmock-matchers.h:1661:39: note: in instantiation of function template specialization 'testing::SafeMatcherCast<const std::vector<std::variant<int, Nullable<int>>> &, testing::internal::ElementsAreMatcher<std::tuple<int>>>' requested here
 1661 |     const Matcher<const T&> matcher = SafeMatcherCast<const T&>(matcher_);
      |                                       ^
<source>:26:3: note: in instantiation of function template specialization 'testing::internal::PredicateFormatterFromMatcher<testing::internal::ElementsAreMatcher<std::tuple<int>>>::operator()<std::vector<std::variant<int, Nullable<int>>>>' requested here
   26 |   ASSERT_THAT(v, ::testing::ElementsAre(16));
      |   ^
/opt/compiler-explorer/libs/googletest/trunk/googlemock/include/gmock/gmock-matchers.h:5736:3: note: expanded from macro 'ASSERT_THAT'
 5736 |   ASSERT_PRED_FORMAT1(              \
      |   ^
/opt/compiler-explorer/libs/googletest/trunk/googletest/include/gtest/gtest_pred_impl.h:112:3: note: expanded from macro 'ASSERT_PRED_FORMAT1'
  112 |   GTEST_PRED_FORMAT1_(pred_format, v1, GTEST_FATAL_FAILURE_)
      |   ^
/opt/compiler-explorer/libs/googletest/trunk/googletest/include/gtest/gtest_pred_impl.h:100:28: note: expanded from macro 'GTEST_PRED_FORMAT1_'
  100 |   GTEST_ASSERT_(pred_format(#v1, v1), on_failure)
      |                            ^
/opt/compiler-explorer/libs/googletest/trunk/googletest/include/gtest/gtest_pred_impl.h:79:52: note: expanded from macro 'GTEST_ASSERT_'
   79 |   if (const ::testing::AssertionResult gtest_ar = (expression)) \
      |                                                    ^
1 error generated.

@philnik777 please take a look.

@philnik777
Copy link
Contributor Author

@alexfh Could you maybe provide a reproducer without a dependency on Google Test? I don't easily have access to that, and it shouldn't be too hard to remove that. It looks to me like it's instantiating the variant constructor with something, but unfortunately the diagnostics are cut off at exactly that point.

@alexfh
Copy link
Contributor

alexfh commented Jul 23, 2025

I'm trying to unwrap this more, so far I got to this point: https://gcc.godbolt.org/z/q76YxfKzP

Hopefully I can leave only libc++ includes soon.

@alexfh
Copy link
Contributor

alexfh commented Jul 24, 2025

I'm trying to unwrap this more, so far I got to this point: https://gcc.godbolt.org/z/q76YxfKzP

Hopefully I can leave only libc++ includes soon.

It turned out to be trickier than I thought, but the test is being automatically reduced. I'll post the result when it's ready.

@alexfh
Copy link
Contributor

alexfh commented Jul 24, 2025

The reduced test case: https://gcc.godbolt.org/z/KhebWaGsc

#include <variant>
#include <vector>
namespace testing {
template <typename T> struct Matcher {
  Matcher(int);
  Matcher(T);
};
template <typename T, typename M>
Matcher<T> SafeMatcherCast(M polymorphic_matcher_or_value) {
  return polymorphic_matcher_or_value;
}
namespace internal {
template <typename MatcherTuple> struct ElementsAreMatcher {
  template <typename Container> operator Matcher<Container>() {
    typedef typename std::remove_const<
        typename std::remove_reference<Container>::type>::type ::value_type
        Element;
    typedef std::vector<Matcher<Element>> MatcherVec;
    MatcherVec matchers;
    matchers.reserve(std::tuple_size<MatcherTuple>::value);
    return 0;
  }
};
} // namespace internal
} // namespace testing
template <typename> struct Nullable {
  template <typename V> Nullable(V);
};
void f(testing::internal::ElementsAreMatcher<std::tuple<>> m) {
  testing::SafeMatcherCast<std::vector<std::variant<Nullable<int>>>>(m);
}
In file included from <source>:1:
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/variant:1186:37: fatal error: recursive template instantiation exceeded maximum depth of 1024
 1186 |              enable_if_t<!is_same_v<__remove_cvref_t<_Arg>, variant>, int>        = 0,
      |                                     ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/variant:1192:35: note: while substituting prior template arguments into non-type template parameter [with _Arg = testing::Matcher<std::variant<Nullable<int>>>]
 1192 |   _LIBCPP_HIDE_FROM_ABI constexpr variant(_Arg&& __arg) noexcept(is_nothrow_constructible_v<_Tp, _Arg>)
      |                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1193 |       : __impl_(in_place_index<_Ip>, std::forward<_Arg>(__arg)) {}
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/__type_traits/invoke.h:71:69: note: while substituting deduced template arguments into function template 'variant' [with _Arg = testing::Matcher<std::variant<Nullable<int>>>, $1 = (no value), $2 = (no value), $3 = (no value), _Tp = (no value), _Ip = (no value), $6 = (no value)]
   71 | using __invoke_result_t _LIBCPP_NODEBUG = decltype(__builtin_invoke(std::declval<_Args>()...));
      |                                                                     ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/__type_traits/invoke.h:387:1: note: in instantiation of template type alias '__invoke_result_t' requested here
  387 | using invoke_result_t = __invoke_result_t<_Fn, _Args...>;
      | ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/variant:1140:1: note: in instantiation of template type alias 'invoke_result_t' requested here
 1140 | using __best_match_t _LIBCPP_NODEBUG = typename invoke_result_t<_MakeOverloads<_Types...>, _Tp, _Tp>::type;
      | ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/variant:1189:45: note: in instantiation of template type alias '__best_match_t' requested here
 1189 |              class _Tp  = __variant_detail::__best_match_t<_Arg, _Types...>,
      |                                             ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/variant:1192:35: note: (skipping 2539 contexts in backtrace; use -ftemplate-backtrace-limit=0 to see all)
 1192 |   _LIBCPP_HIDE_FROM_ABI constexpr variant(_Arg&& __arg) noexcept(is_nothrow_constructible_v<_Tp, _Arg>)
      |                                   ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/__vector/vector.h:845:8: note: in instantiation of function template specialization 'std::__uninitialized_allocator_relocate<std::allocator<testing::Matcher<std::variant<Nullable<int>>>>, testing::Matcher<std::variant<Nullable<int>>> *>' requested here
  845 |   std::__uninitialized_allocator_relocate(
      |        ^
/opt/compiler-explorer/clang-trunk-20250723/bin/../include/c++/v1/__vector/vector.h:1101:5: note: in instantiation of member function 'std::vector<testing::Matcher<std::variant<Nullable<int>>>>::__swap_out_circular_buffer' requested here
 1101 |     __swap_out_circular_buffer(__v);
      |     ^
<source>:20:14: note: in instantiation of member function 'std::vector<testing::Matcher<std::variant<Nullable<int>>>>::reserve' requested here
   20 |     matchers.reserve(std::tuple_size<MatcherTuple>::value);
      |              ^
<source>:10:10: note: in instantiation of function template specialization 'testing::internal::ElementsAreMatcher<std::tuple<>>::operator Matcher<std::vector<std::variant<Nullable<int>>>>' requested here
   10 |   return polymorphic_matcher_or_value;
      |          ^
<source>:30:12: note: in instantiation of function template specialization 'testing::SafeMatcherCast<std::vector<std::variant<Nullable<int>>>, testing::internal::ElementsAreMatcher<std::tuple<>>>' requested here
   30 |   testing::SafeMatcherCast<std::vector<std::variant<Nullable<int>>>>(m);
      |            ^

@philnik777 PTAL

@alexfh
Copy link
Contributor

alexfh commented Jul 28, 2025

@philnik777 please let me know, when you will be able to get to this. Thanks!

@philnik777
Copy link
Contributor Author

@alexfh I've been able to take a look now. It looks like we're doing the invoke_result_t resolution more eagerly now, which results in recursively trying to instantiate the same template. I've opened #151028 to track the bug. I have no idea how to classify this bug at this point though. I have no idea whether this is a variant of invoke bug. I don't think it's a problem with the bultin at least...
I'll definitely talk with Louis about it tomorrow or latest on Wednesday though.

@alexfh
Copy link
Contributor

alexfh commented Jul 28, 2025

Thank you for the update! We can work around this for now, and the impact is quite limited, but it would be nice to get this working again relatively soon.

@philnik777
Copy link
Contributor Author

Thank you for the update! We can work around this for now, and the impact is quite limited, but it would be nice to get this working again relatively soon.

Absolutely. This is definitely a bug of some sort in libc++. I hope I'll be able to talk to @ldionne tomorrow so we can talk about this, since I'm genuinely at a loss right now.

@ldionne
Copy link
Member

ldionne commented Jul 30, 2025

I created #151328 to track this. I think we'll fix this in std::variant, at least until we have a better understanding of the issue and that points us to std::invoke being non-conforming after the change. I don't think that's the case but I'm not 100% certain.

ldionne pushed a commit that referenced this pull request Aug 11, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants